- NAME
- perltrap - Perl traps for the unwary
-
- DESCRIPTION
- The biggest trap of all is forgetting to use the -w
- switch; see the perlrun manpage. The second biggest trap
- is not making your entire program runnable under use
- strict.
-
- Awk Traps
-
- Accustomed awk users should take special note of the
- following:
-
- o The English module, loaded via
-
- use English;
-
- allows you to refer to special variables (like $RS) as
- though they were in awk; see the perlvar manpage for
- details.
-
- o Semicolons are required after all simple statements in
- Perl (except at the end of a block). Newline is not a
- statement delimiter.
-
- o Curly brackets are required on ifs and whiles.
-
- o Variables begin with "$" or "@" in Perl.
-
- o Arrays index from 0. Likewise string positions in
- substr() and index().
-
- o You have to decide whether your array has numeric or
- string indices: numeric indices mean an ordinary
- array, string indices an associative array.
-
- o Associative array values do not spring into existence
- upon mere reference.
-
- o You have to decide whether you want to use string or
- numeric comparisons.
-
- o Reading an input line does not split it for you. You
- get to split it yourself into an array. And the
- split() operator has different arguments.
-
- o The current input line is normally in $_, not $0. It
- generally does not have the newline stripped. ($0 is
- the name of the program executed.) See the perlvar
- manpage.
-
- o $<digit> does not refer to fields--it refers to
- substrings matched by the last match pattern.
-
- o The print() statement does not add field and record
- separators unless you set $, and $\. You can set $OFS
- and $ORS if you're using the English module.
-
- o You must open your files before you print to them.
-
- o The range operator is "..", not comma. The comma
- operator works as in C.
-
- o The match operator is "=~", not "~". ("~" is the
- one's complement operator, as in C.)
-
- o The exponentiation operator is "**", not "^". "^" is
- the XOR operator, as in C. (You know, one could get
- the feeling that awk is basically incompatible with
- C.)
-
- o The concatenation operator is ".", not the null
- string. (Using the null string would render /pat/
- /pat/ unparsable, since the third slash would be
- interpreted as a division operator--the tokener is in
- fact slightly context sensitive for operators like
- "/", "?", and ">". And in fact, "." itself can be the
- beginning of a number.)
-
- o The next, exit, and continue keywords work
- differently.
-
- o The following variables work differently:
-
- Awk         Perl
- ARGC        $#ARGV or scalar @ARGV
- ARGV[0]     $0
- FILENAME    $ARGV
- FNR         $. - something
- FS          (whatever you like)
- NF          $#Fld, or some such
- NR          $.
- OFMT        $#
- OFS         $,
- ORS         $\
- RLENGTH     length($&)
- RS          $/
- RSTART      length($`)
- SUBSEP      $;
-
-
- o You cannot set $RS to a pattern, only a string.
-
- o When in doubt, run the awk construct through a2p and
- see what it gives you.
-
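- Several of the awk traps above can be seen side by side in
- a short sketch. It is an illustration only: the sample
- record and the names $line, @fld and %seen are invented
- for the example.

```perl
use strict;
use warnings;

# Hypothetical input record; awk would hand this over already split.
my $line = "alpha beta gamma\n";

chomp $line;                    # the newline is not stripped for you
my @fld = split ' ', $line;     # split it yourself; ' ' splits on whitespace

my $first = $fld[0];            # arrays index from 0, so awk's $1 is $fld[0]
my $nf    = scalar @fld;        # field count; $#fld is the last index

# String and numeric comparisons use different operators.
my $num_equal = ("1.0" == 1)   ? 1 : 0;   # numerically equal
my $str_equal = ("1.0" eq "1") ? 1 : 0;   # different as strings

my $joined = $first . "-" . $fld[2];      # concatenation needs an explicit "."
my $cube   = 2 ** 3;                      # exponentiation is "**"
my $xor    = 2 ^ 3;                       # "^" is XOR

# Merely reading a hash element does not create it.
my %seen;
my $probe   = $seen{nothing};
my $created = exists $seen{nothing} ? 1 : 0;

print "$first $nf $joined $cube $xor $num_equal $str_equal $created\n";
```

- The one-argument-per-line layout is a matter of taste; the
- point is only that each step awk performs implicitly is
- spelled out in Perl.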
-
-
- C Traps
-
- Cerebral C programmers should take note of the following:
-
- o Curly brackets are required on if's and while's.
-
- o You must use elsif rather than else if.
-
- o The break and continue keywords from C become in Perl
- last and next, respectively. Unlike in C, these do
- NOT work within a do { } while construct.
-
- o There's no switch statement. (But it's easy to build
- one on the fly.)
-
- o Variables begin with "$" or "@" in Perl.
-
- o printf() does not implement the "*" format for
- interpolating field widths, but it's trivial to use
- interpolation of double-quoted strings to achieve the
- same effect.
-
- o Comments begin with "#", not "/*".
-
- o You can't take the address of anything, although a
- similar operator in Perl 5 is the backslash, which
- creates a reference.
-
- o ARGV must be capitalized. $ARGV[0] is C's argv[1],
- and argv[0] ends up in $0.
-
- o System calls such as link(), unlink(), rename(), etc.
- return nonzero for success, not 0.
-
- o Signal handlers deal with signal names, not numbers.
- Use kill -l to find their names on your system.
-
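- As a rough sketch of the C traps above, the following
- shows elsif, next and last, a reference taken with
- backslash, a field width interpolated into a format, and
- the return value of unlink(). The variable names are made
- up, and the temporary file comes from the File::Temp
- module, which is an assumption about the installed Perl
- rather than anything this manpage describes.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);   # assumed available; used only to get a scratch file

my @kept;
for my $n (1 .. 10) {
    if ($n % 2) {              # braces are required even for one statement
        next;                  # C's "continue"
    } elsif ($n > 6) {         # "elsif", not "else if"
        last;                  # C's "break"
    }
    push @kept, $n;
}

# There is no address-of operator, but backslash creates a reference.
my $count = 5;
my $ref   = \$count;
$$ref++;                       # changes $count through the reference

# A field width can be supplied by interpolating it into the format.
my $width  = 6;
my $padded = sprintf("%${width}s|", "ab");

# unlink() returns the number of files removed: nonzero means success.
my ($fh, $path) = tempfile();
close $fh;
my $removed_first  = unlink $path;   # file existed
my $removed_second = unlink $path;   # already gone

print "@kept $count $padded $removed_first $removed_second\n";
```
-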
- Sed Traps
-
- Seasoned sed programmers should take note of the
- following:
-
- o Backreferences in substitutions use "$" rather than
- "\".
-
- o The pattern matching metacharacters "(", ")", and "|"
- do not have backslashes in front.
-
- o The range operator is "...", rather than comma.
-
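- The sed traps above might look like this in practice. The
- sample strings and the START and END markers are
- hypothetical, chosen only to make the behavior visible.

```perl
use strict;
use warnings;

# Grouping and alternation need no backslashes, and the replacement
# refers to captured groups as $1 and $2 rather than \1 and \2.
my $text = "cat hat";
(my $swapped = $text) =~ s/(cat|dog) (\w+)/$2 $1/;

# sed's /START/,/END/ address range becomes the "..." operator.
my @lines = ("skip", "START", "body", "END", "skip again");
my @picked;
for (@lines) {
    push @picked, $_ if /START/ ... /END/;
}

print "$swapped\n";
print "@picked\n";
```
-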
- Shell Traps
-
- Sharp shell programmers should take note of the following:
-
- o The backtick operator does variable interpretation
- without regard to the presence of single quotes in the
- command.
-
- o The backtick operator does no translation of the
- return value, unlike csh.
-
- o Shells (especially csh) do several levels of
- substitution on each command line. Perl does
- substitution only in certain constructs such as double
- quotes, backticks, angle brackets, and search
- patterns.
-
- o Shells interpret scripts a little bit at a time. Perl
- compiles the entire program before executing it
- (except for BEGIN blocks, which execute at compile
- time).
-
- o The arguments are available via @ARGV, not $1, $2,
- etc.
-
- o The environment is not automatically made available as
- separate scalar variables.
-
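- A small sketch of the shell traps above follows. It
- assumes a Unix-like system where an echo command is
- available to backticks, and the environment variable name
- PERLTRAP_DEMO is invented for the example.

```perl
use strict;
use warnings;

my $word = "perl";

# Perl interpolates $word before the shell runs, so the single quotes
# do not protect it; the shell is handed: echo 'perl'
my $out = `echo '$word'`;

# The output comes back untranslated, trailing newline included.
chomp(my $trimmed = $out);

# Arguments live in @ARGV; this assignment stands in for a real command line.
@ARGV = ("first", "second");
my $arg1 = $ARGV[0];           # what a shell script would call $1

# Environment variables are reached through %ENV, not as separate scalars.
$ENV{PERLTRAP_DEMO} = "from the environment";   # hypothetical variable
my $env_value = $ENV{PERLTRAP_DEMO};

print "$trimmed / $arg1 / $env_value\n";
```
-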
- Perl Traps
-
- Practicing Perl Programmers should take note of the
- following:
-
- o Remember that many operations behave differently in a
- list context than they do in a scalar one. See the
- perldata manpage for details.
-
- o Avoid barewords if you can, especially all lower-case
- ones. You can't tell just by looking at it whether a
- bareword is a function or a string. By using quotes
- on strings and parens on function calls, you won't
- ever get them confused.
-
- o You cannot discern from mere inspection which built-
- ins are unary operators (like chop() and chdir()) and
- which are list operators (like print() and unlink()).
- (User-defined subroutines can only be list operators,
- never unary ones.) See the perlop manpage.
-
- o Some functions default to $_, or @ARGV, or whatever,
- while others that you might expect to have a default
- do not. It is hard to remember which is which.
-
- o The <FH> construct is not the name of the filehandle;
- it is a readline operation on that handle. The data
- read is assigned to $_ only if the file read is the
- sole condition in a while loop:
-
- while (<FH>) { }
- while ($_ = <FH>) { }
- <FH>; # data discarded!
-
-
- o Remember not to use "=" when you need "=~"; these two
- constructs are quite different:
-
- $x = /foo/;
- $x =~ /foo/;
-
-
- o The do {} construct isn't a real loop that you can use
- loop control on.
-
- o Use my() for local variables whenever you can get away
- with it (but see the perlform manpage for where you
- can't). Using local() actually gives a local value to
- a global variable, which leaves you open to unforeseen
- side-effects of dynamic scoping.
-
- o If you localize an exported variable in a module, its
- exported value will not change. The local name
- becomes an alias to a new value but the external name
- is still an alias for the original.
-
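- The following sketch illustrates a few of the Perl traps
- above: list versus scalar context, "=" versus "=~", a
- discarded read, and my() versus local(). All names are
- made up, and reading from an in-memory string through
- open() is an assumption about the Perl in use, not
- something this manpage covers.

```perl
use strict;
use warnings;

my @items = qw(a b c);
my ($first) = @items;    # list context: the first element
my $count   = @items;    # scalar context: the number of elements

$_ = "food";
my $x = "bar";
my $bound = ($x =~ /foo/) ? 1 : 0;   # matches against $x: no match
$x = /foo/;                          # matches against $_ and overwrites $x

# A bare read outside a while condition throws the line away.
my $data = "one\ntwo\nthree\n";      # hypothetical file contents
open(my $fh, '<', \$data) or die "open: $!";
<$fh>;                               # "one" is read and discarded
my @rest;
while (<$fh>) { chomp; push @rest, $_ }
close $fh;

# local() changes a global for everything called from here;
# my() creates a variable that called subroutines cannot see.
our $level = "global";
sub show       { return $level }
sub with_local { local $level = "local";   return show() }
sub with_my    { my    $level = "lexical"; return show() }

print "$first $count $bound $x @rest ", with_local(), " ", with_my(), "\n";
```
-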
- Perl4 Traps
-
- Penitent Perl 4 Programmers should take note of the
- following incompatible changes that occurred between
- release 4 and release 5:
-
- o @ now always interpolates an array in double-quotish
- strings. Some programs may now need to use backslash
- to protect any @ that shouldn't interpolate.
-
- o Barewords that used to look like strings to Perl will
- now look like subroutine calls if a subroutine by that
- name is defined before the compiler sees them. For
- example:
-
- sub SeeYa { die "Hasta la vista, baby!" }
- $SIG{'QUIT'} = SeeYa;
-
- In Perl 4, that set the signal handler; in Perl 5, it
- actually calls the function! You may use the -w
- switch to find such places.
-
- o Symbols starting with _ are no longer forced into
- package main, except for $_ itself (and @_, etc.).
-
- o Double-colon is now a valid package separator in an
- identifier. Thus these behave differently in perl4
- vs. perl5:
-
- print "$a::$b::$c\n";
- print "$var::abc::xyz\n";
-
-
- o s'$lhs'$rhs' now does no interpolation on either side.
- It used to interpolate $lhs but not $rhs.
-
- o The second and third arguments of splice() are now
- evaluated in scalar context (as the book says) rather
- than list context.
-
- o These are now semantic errors because of precedence:
-
- shift @list + 20;
- $n = keys %map + 20;
-
- Because if that were to work, then this couldn't:
-
- sleep $dormancy + 20;
-
-
- o The precedence of assignment operators is now the same
- as the precedence of assignment. Perl 4 mistakenly
- gave them the precedence of the associated operator.
- So you now must parenthesize them in expressions like
-
- /foo/ ? ($a += 2) : ($a -= 2);
-
- Otherwise
-
- /foo/ ? $a += 2 : $a -= 2;
-
- would be erroneously parsed as
-
- (/foo/ ? $a += 2 : $a) -= 2;
-
- On the other hand,
-
- $a += /foo/ ? 1 : 2;
-
- now works as a C programmer would expect.
-
- o open FOO || die is now incorrect. You need parens
- around the filehandle. While temporarily supported,
- using such a construct will generate a non-fatal (but
- non-suppressible) warning.
-
- o The elements of argument lists for formats are now
- evaluated in list context. This means you can
- interpolate list values now.
-
- o You can't do a goto into a block that is optimized
- away. Darn.
-
- o It is no longer syntactically legal to use whitespace
- as the name of a variable, or as a delimiter for any
- kind of quote construct. Double darn.
-
- o The caller() function now returns a false value in a
- scalar context if there is no caller. This lets
- library files determine if they're being required.
-
- o m//g now attaches its state to the searched string
- rather than the regular expression.
-
- o reverse is no longer allowed as the name of a sort
- subroutine.
-
- o taintperl is no longer a separate executable. There
- is now a -T switch to turn on tainting when it isn't
- turned on automatically.
-
- o Double-quoted strings may no longer end with an
- unescaped $ or @.
-
- o The archaic while/if BLOCK BLOCK syntax is no longer
- supported.
-
- o Negative array subscripts now count from the end of
- the array.
-
- o The comma operator in a scalar context is now
- guaranteed to give a scalar context to its arguments.
-
- o The ** operator now binds more tightly than unary
- minus. It was documented to work this way before, but
- didn't.
-
- o Setting $#array lower now discards array elements.
-
- o delete() is not guaranteed to return the old value for
- tie()d arrays, since this capability may be onerous
- for some modules to implement.
-
- o The construct "this is $$x" used to interpolate the
- pid at that point, but now tries to dereference $x.
- $$ by itself still works fine, however.
-
- o The meaning of foreach has changed slightly when it is
- iterating over a list which is not an array. This
- used to assign the list to a temporary array, but no
- longer does so (for efficiency). This means that
- you'll now be iterating over the actual values, not
- over copies of the values. Modifications to the loop
- variable can change the original values. To retain
- Perl 4 semantics you need to assign your list
- explicitly to a temporary array and then iterate over
- that. For example, you might need to change
-
- foreach $var (grep /x/, @list) { ... }
-
- to
-
- foreach $var (my @tmp = grep /x/, @list) { ... }
-
- Otherwise changing $var will clobber the values of
- @list. (This most often happens when you use $_ for
- the loop variable, and call subroutines in the loop
- that don't properly localize $_.)
-
- o Some error messages will be different.
-
- o Some bugs may have been inadvertently removed.
-
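- Several of the Perl 5 behaviors listed above can be
- checked with a short sketch. The data and variable names
- are invented for illustration, and the mail address is a
- placeholder.

```perl
use strict;
use warnings;

# An @ that should not interpolate must be protected with a backslash.
my $addr = "user\@example.com";          # placeholder address

# Parenthesize the argument so that "+ 20" applies to the result.
my %map = (a => 1, b => 2);
my $n   = keys(%map) + 20;

my @list = (10, 20, 30, 40);
my $last = $list[-1];                    # negative subscripts count from the end
$#list   = 1;                            # shrinking $#list discards elements

my $neg = -2 ** 2;                       # ** binds tighter than unary minus

# foreach iterates over the actual values, so changing the loop
# variable changes the elements that grep selected from @words.
my @words = qw(ax bx c);
foreach my $w (grep /x/, @words) { $w = uc $w }

# Iterating over an explicit temporary copy leaves the source alone.
my @src = (1, 2, 3);
foreach my $v (my @tmp = @src) { $v *= 10 }

print "$addr $n $last @list $neg @words @src\n";
```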